Summary-based routing for content-based event distribution networks

ABSTRACT

A system and method for enabling highly scalable multi-node event distribution networks through the use of summary-based routing, particularly event distribution networks using a content-based publish/subscribe model to distribute information. By allowing event routers to use imprecise summaries of the subscriptions hosted by matcher nodes, an event router can eliminate itself as a bottleneck, thus improving overall event distribution network throughput even though the use of imprecise summaries results in some false positive event traffic. False positive event traffic is reduced by using a filter set partitioning that provides for good subscription set locality at each matcher node, while at the same time avoiding overloading any one matcher node. Good subscription set locality is maintained by routing new subscriptions to a matcher node with a subscription summary that best covers the new subscription. Where event space partitioning is desirable, an over-partitioning scheme is described that enables load balancing without repartitioning.

FIELD OF THE INVENTION

This invention pertains generally to computer networks, and, more particularly, to computer networks that use a publish/subscribe model to distribute information.

BACKGROUND OF THE INVENTION

Today's computer data networks span the globe and provide an ever-increasing variety of information and types of information. A popular model for retrieving information is the request-response model. This is a model used, for example, by the World Wide Web: a Web client requests a Web page from a Web server and then waits until the Web server responds. This model is adequate for basic access to information, but as information consumers become more sophisticated, it quickly becomes inefficient for information consumers or information providers or both. As a general example, under the request-response model, a consumer only interested in changes to an item of information (e.g., a stock price) may be required to request the information over and over again until a change is detected in the response.

A model complementary to request-response that is becoming increasingly popular is publish/subscribe. Under a publish/subscribe model, information consumers submit subscriptions covering events of interest to a publish/subscribe service. Then, whenever information providers publish events to the service, consumers are notified of those events to which they have subscribed. News alerts and stock quotes are classic examples of information suited to distribution via a publish/subscribe model. Examples of other applications that use a publish/subscribe model include instant messaging, online auctions and electronic commerce price databases.

In addition, new applications are emerging where software agents play the role of information consumer, for example, in communicating with sensors and devices to perform automation tasks, and in monitoring and executing routine business-to-business transactions. Software agents present additional scalability challenges to the design of a publish/subscribe system because they are able to significantly increase the total number of subscribers, they are able to handle very complex subscriptions, and they are able to receive and process notifications at a very high rate.

Early publish/subscribe systems used a flat channel subscription model. Information consumers subscribed to a named channel and received only events that an information provider published to that particular channel. An improvement over the flat channel subscription model is to arrange the channels into a hierarchy of topics and subtopics so that a subscriber to a topic receives any events published to the topic and any of its subtopics. Modern publish/subscribe systems are able to allow even more fine-grained selection of events by enabling subscriptions to events based on the content of an event.

A content-based publish/subscribe system specifies an event schema for a topic, which lists the names and types of attributes that appear in an event. A subscription filter associated with a subscription may then be specified as a conjunction of predicates on a subset of those attributes. For example, a “stock quotes” topic specifies an event schema with three attributes: Symbol, Price, and Volume. An example event is (Symbol=MSFT and Price=79.30 and Volume=40,000,000); an example subscription filter is (Symbol=MSFT and Price>80.00). In a further example, the topic is itself an attribute of the event schema, e.g., Topic, Symbol, Price and Volume, so that subscribing to a topic and/or subtopic is then an aspect of the more general content-based subscription mechanism.
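The matching of an event against a subscription filter described above may be illustrated with a minimal sketch. The names used here (Predicate, matches, OPS) and the set of supported comparison operators are assumptions made purely for illustration and are not part of the described system; the event and the first filter are the “stock quotes” examples given above.

```python
import operator
from dataclasses import dataclass

# Assumed set of comparison operators; the description only shows "=" and ">".
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
       ">=": operator.ge, "<=": operator.le}


@dataclass(frozen=True)
class Predicate:
    """One predicate on a single named attribute (hypothetical helper)."""
    attribute: str
    op: str
    value: object

    def holds(self, event: dict) -> bool:
        # A predicate on an attribute missing from the event does not hold.
        if self.attribute not in event:
            return False
        return OPS[self.op](event[self.attribute], self.value)


def matches(subscription_filter, event) -> bool:
    """A filter is a conjunction: every predicate must hold for the event."""
    return all(p.holds(event) for p in subscription_filter)


# The example event and subscription filter from the text.
event = {"Symbol": "MSFT", "Price": 79.30, "Volume": 40_000_000}
price_filter = [Predicate("Symbol", "=", "MSFT"), Predicate("Price", ">", 80.00)]

# A second, purely illustrative filter on a different subset of attributes.
volume_filter = [Predicate("Symbol", "=", "MSFT"),
                 Predicate("Volume", ">=", 1_000_000)]

print(matches(price_filter, event))   # the price predicate fails for 79.30
print(matches(volume_filter, event))
```

In this sketch the example event does not match the example filter, because the Price predicate is not satisfied, while it does match the illustrative volume filter.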

A new content-based publish/subscribe service will typically begin with a single physical server that receives and stores subscriptions from each service subscriber, receives events from each service publisher, performs matching of each event against the subscriptions, and sends notifications to subscribers with matching subscriptions. However, a successful service will eventually require performance beyond the capabilities of a single physical server. For such a service, a network of physical servers and/or a distributed system architecture is required.

Some prior art systems have incorporated a network of physical servers by propagating each event published to the service to each of the physical servers in the network, but this technique has inherent inefficiencies. Some prior art systems have achieved better efficiency by using a precise subscription filter summary. In such systems, each physical server that hosts subscriptions calculates a precise summary of the subscription filters associated with the subscriptions. The precise filter summary is then propagated against the flow of events and used by upstream event routers to block unnecessary event traffic as early in the route as possible.

There are problems with prior art systems that use precise subscription filter summaries. One problem is that in practice a precise subscription filter summary becomes so complex that event routers become a system bottleneck, degrading overall system throughput. Another problem is that subscription filters associated with subscriptions hosted by a server sometimes have poor locality. When that is the case, a summary of the subscription filter is too broad to be effective in reducing event traffic.

To ensure continuing success for content-based publish/subscribe services, there is a need in the art to solve such problems.

BRIEF SUMMARY OF THE INVENTION

The invention provides a system and method that address shortcomings of the prior art described herein above. These and other advantages of the invention, as well as additional inventive features, will be apparent from the description of the invention provided herein with reference to an exemplary embodiment. The invention provides a system and method for summary-based routing in an event distribution network. More particularly, the invention is directed to enabling highly scalable multi-node event distribution networks through the use of summary-based routing. The invention has a particular relevance to an event distribution network using a content-based publish/subscribe model to distribute information.

An event router node of an event distribution network maintains an imprecise summary of the set of subscriptions hosted by each matcher node. If the event router node is overloaded, it reduces the precision of the imprecise summaries. Reducing the precision of the imprecise summaries allows the event router to process each event faster. If the event router load falls beneath some threshold, then it increases the precision of the imprecise summaries. Increasing the precision of the imprecise summaries reduces the amount of false positive traffic routed to a matcher node. False positive traffic makes a matcher node work harder. There is a balance point at some level of imprecision that optimizes the throughput of the event distribution network as a whole.
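The feedback rule described above may be sketched as follows. This is only an illustration under stated assumptions: the representation of precision as an integer level, the load thresholds (high_load, low_load), the step size, and the function name adjust_precision are all hypothetical and are not taken from the description.

```python
def adjust_precision(precision: int, router_load: float,
                     high_load: float = 0.9, low_load: float = 0.6,
                     step: int = 1,
                     min_precision: int = 1, max_precision: int = 16) -> int:
    """Return the summary precision level to use for the next interval.

    precision    -- current precision level (assumed: larger means more precise)
    router_load  -- measured event router utilization in [0, 1] (assumed metric)
    """
    if router_load > high_load:
        # Router is overloaded: coarser summaries are cheaper to evaluate,
        # at the cost of more false positive traffic to the matcher nodes.
        return max(min_precision, precision - step)
    if router_load < low_load:
        # Router has spare capacity: finer summaries filter out more
        # false positive traffic before it reaches the matcher nodes.
        return min(max_precision, precision + step)
    # Between the two thresholds the current level is left unchanged.
    return precision


print(adjust_precision(8, 0.95))  # overloaded router, precision is reduced
print(adjust_precision(8, 0.30))  # lightly loaded router, precision is increased
```

Using two thresholds rather than one is a design choice of this sketch that avoids oscillating between adjacent precision levels when the load hovers near a single cut-off.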

Subscriptions to be hosted by an event distribution network are divided among the matcher nodes of the event distribution network so as to provide good subscription locality to the set of subscriptions hosted by each matcher node, while at the same time avoiding overloading any one matcher node (i.e., ensuring that each set of subscriptions covers a corresponding area of event space). If a set of subscriptions has poor locality, the imprecise summary of the set of subscriptions will result in more false positive event traffic than if the set of subscriptions has good locality. Providing for good subscription locality further enhances the throughput of the event distribution network as a whole. Good subscription locality is maintained by routing new subscriptions to the matcher node with the subscription summary that best covers the new subscription.
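One way to picture the routing of a new subscription is sketched below. The sketch assumes, for illustration only, a two dimensional event space, a summary that is a single bounding rectangle per matcher node, and an interpretation of “best covers” as “requires the least growth of the summary rectangle”. The names Rect, enlargement and choose_matcher, and the simple per-matcher subscription count used as a load limit, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in a two dimensional event space."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float

    def area(self) -> float:
        return (self.x_hi - self.x_lo) * (self.y_hi - self.y_lo)

    def union(self, other: "Rect") -> "Rect":
        # Smallest rectangle that bounds both rectangles.
        return Rect(min(self.x_lo, other.x_lo), max(self.x_hi, other.x_hi),
                    min(self.y_lo, other.y_lo), max(self.y_hi, other.y_hi))


def enlargement(summary: Rect, subscription: Rect) -> float:
    """Area the summary would have to grow by to cover the subscription."""
    return summary.union(subscription).area() - summary.area()


def choose_matcher(summaries: dict, loads: dict,
                   subscription: Rect, max_load: int) -> str:
    """Pick the non-overloaded matcher whose summary best covers the subscription."""
    candidates = [m for m in summaries if loads[m] < max_load]
    if not candidates:
        raise RuntimeError("every matcher node is at its load limit")
    # Least enlargement first; ties are broken in favor of the lighter load.
    return min(candidates,
               key=lambda m: (enlargement(summaries[m], subscription), loads[m]))


summaries = {"A": Rect(0, 10, 0, 10), "B": Rect(50, 60, 50, 60)}
loads = {"A": 3, "B": 1}
new_subscription = Rect(2, 4, 3, 5)   # lies entirely inside the summary of A

print(choose_matcher(summaries, loads, new_subscription, max_load=10))
print(choose_matcher(summaries, loads, new_subscription, max_load=3))
```

With a generous load limit the subscription goes to the matcher whose summary already covers it; with a limit that excludes that matcher, it falls back to the remaining one, which reflects the stated goal of preserving locality without overloading any one matcher node.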

Event space partitioning is sometimes desirable, but it is used in circumstances where subscription locality is not applicable in the same way. When event space partitioning is desirable, the event space is over-partitioned and a set of event space partitions is assigned to each matcher node in order to provide for more fine-grained load balancing without repartitioning and, ultimately, to provide for enhanced event distribution network throughput, particularly when combined with event routing using imprecise summaries.
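The over-partitioning idea may be illustrated with a small sketch: the event space is cut into many more partitions than there are matcher nodes, and load is balanced by reassigning whole partitions rather than by redrawing partition boundaries. The round-robin initial assignment, the greedy single-move rebalancing rule, and all names (assign_round_robin, matcher_loads, rebalance_once) are assumptions chosen for illustration.

```python
def assign_round_robin(num_partitions: int, matchers: list) -> dict:
    """Initial assignment: partition id -> matcher node."""
    return {p: matchers[p % len(matchers)] for p in range(num_partitions)}


def matcher_loads(assignment: dict, partition_load: list, matchers: list) -> dict:
    """Total load per matcher, summed over the partitions it is assigned."""
    loads = {m: 0 for m in matchers}
    for partition, matcher in assignment.items():
        loads[matcher] += partition_load[partition]
    return loads


def rebalance_once(assignment: dict, partition_load: list, matchers: list):
    """Move one partition from the busiest to the idlest matcher.

    Partition boundaries are never changed; only the assignment is.
    Returns the id of the moved partition, or None if no move helps.
    """
    loads = matcher_loads(assignment, partition_load, matchers)
    busiest = max(matchers, key=loads.get)
    idlest = min(matchers, key=loads.get)
    gap = loads[busiest] - loads[idlest]
    # Moving a partition of load l changes the gap to |gap - 2*l|,
    # so only partitions with 0 < l < gap reduce the imbalance.
    movable = [p for p, m in assignment.items()
               if m == busiest and 0 < partition_load[p] < gap]
    if not movable:
        return None
    best = min(movable, key=lambda p: abs(gap - 2 * partition_load[p]))
    assignment[best] = idlest
    return best


matchers = ["m1", "m2"]
partition_load = [50, 5, 40, 5, 30, 5, 20, 5]   # 8 partitions for 2 matchers
assignment = assign_round_robin(len(partition_load), matchers)
before = matcher_loads(assignment, partition_load, matchers)
moved = rebalance_once(assignment, partition_load, matchers)
after = matcher_loads(assignment, partition_load, matchers)
print(before, moved, after)
```

Because each matcher node holds several small partitions, a single reassignment can shift a modest fraction of its load, which is what makes the balancing fine-grained.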

An event distribution network node is incorporated in highly scalable multi-node event distribution networks that use summary-based routing. Finally, an event distribution network node in accordance with particular embodiments of the invention is implemented in the context of an extended Web Services framework built around XML and SOAP standards and technologies.

BRIEF DESCRIPTION OF THE DRAWINGS

While the appended claims set forth the features of the present invention with particularity, the invention and its advantages are best understood from the following detailed description taken in conjunction with the accompanying drawings, of which:

FIG. 1 is a schematic diagram generally illustrating an exemplary computer system usable to implement an embodiment of the invention.

FIG. 2 is a schematic diagram of a publish/subscribe system in accordance with an embodiment of the invention.

FIG. 3 is a schematic diagram of a multi-node publish/subscribe service in accordance with an embodiment of the invention.

FIG. 4 is a schematic diagram of an event distribution network (EDN) in accordance with an embodiment of the invention.

FIG. 5 is a graph showing a subscription rectangle and two events in a two dimensional event space.

FIG. 6 is a schematic representation of event space showing several subscription rectangles, some of which are covered by some others.

FIG. 7 is a graph showing maximum system throughput occurring at an imprecise level of summary precision in accordance with an aspect of the invention.

FIG. 8A is a schematic representation of event space showing two subscription rectangles and their minimum bounding rectangle.

FIG. 8B is a schematic representation of event space showing three subscription rectangles with worse locality than those of FIG. 8C.

FIG. 8C is a schematic representation of event space showing three subscription rectangles with better locality than those of FIG. 8B.

FIG. 9A is a schematic diagram of an R-tree that indexes a precise summary of a subscription filter set.

FIG. 9B is a schematic diagram of an R-tree that indexes an imprecise summary of a subscription filter set in accordance with an embodiment of the invention.

FIG. 10A is a schematic representation of event space showing four subscription rectangles and an event space partition.

FIG. 10B is a schematic representation of event space showing one filter set partitioning of the subscription rectangles of FIG. 10A.

FIG. 10C is a schematic representation of event space showing another filter set partitioning of the subscription rectangles of FIG. 10A.

FIG. 11A is a block diagram of an event distribution network node architecture in accordance with an embodiment of the invention.

FIG. 11B is a block diagram of an event distribution network node configured as an event router in accordance with an embodiment of the invention.

FIG. 11C is a block diagram of an event distribution network node configured as an event matcher in accordance with an embodiment of the invention.

FIG. 11D is a block diagram of an event distribution network node configured as a subscription router in accordance with an embodiment of the invention.

FIG. 12 is a flowchart depicting steps performed by a subscription router when selecting a matcher to host a new subscription in accordance with an embodiment of the invention.

FIG. 13 is a flowchart depicting steps performed by an event router when automatically adjusting summary precision so as to maximize system throughput in accordance with an embodiment of the invention.

FIG. 14 is a block diagram of an extended Web Services framework used to implement an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

The invention is embodied in an event distribution network utilizing summary-based routing. An event distribution network having event router nodes capable of maintaining an effective imprecise summary of the set of subscriptions hosted by a matcher node is disclosed herein. Routing using imprecise summaries allows an event router to route more events at the cost of some false positive event traffic. False positive event traffic reduces effective matcher node throughput of events but the overall effect on event distribution network throughput is potentially positive when properly exploited. Further reductions in false positive event traffic are achieved by partitioning the subscriptions to be hosted by the event distribution network among the plurality of matcher nodes such that each filter set partition has good locality. Simulations have shown the combined result of routing using imprecise summaries and partitioning for good locality to give a 200% improvement in event distribution network throughput compared to routing using precise summaries alone. A flexible event distribution network node suitable for building event distribution networks that embody the invention is also herein disclosed. In an embodiment of the invention, each event distribution network node is capable of automatically adjusting the level of summary precision it utilizes to route events in order to prevent itself becoming an event distribution network bottleneck, thus optimizing event distribution network throughput. In addition, each event distribution network node is capable of routing new subscriptions to an event distribution network node with a hosted subscription set summary that best matches the new subscription, thus maintaining good hosted subscription set locality without re-partitioning.
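The trade-off between cheaper routing and false positive event traffic may be illustrated with a minimal sketch. It assumes, purely for illustration, that subscriptions are rectangles in a two dimensional event space and that the imprecise summary of a matcher node is the single minimum bounding rectangle of its hosted subscriptions; the names contains, bounding_rect and route are hypothetical.

```python
# A rectangle is a tuple (x_lo, x_hi, y_lo, y_hi); an event is a point (x, y).

def contains(rect, event) -> bool:
    x, y = event
    return rect[0] <= x <= rect[1] and rect[2] <= y <= rect[3]


def bounding_rect(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), max(r[1] for r in rects),
            min(r[2] for r in rects), max(r[3] for r in rects))


def route(event, summaries) -> list:
    """Matchers whose summary covers the event (one test per matcher)."""
    return [m for m, summary in summaries.items() if contains(summary, event)]


hosted = {
    "m1": [(0, 2, 0, 2), (8, 10, 8, 10)],   # two subscriptions far apart
    "m2": [(20, 22, 20, 22)],
}
summaries = {m: bounding_rect(subs) for m, subs in hosted.items()}

event = (5, 5)
targets = route(event, summaries)
really_matches = any(contains(s, event) for s in hosted["m1"])
print(targets, really_matches)   # routed to m1 although no m1 subscription matches

print(route((21, 21), summaries))  # routed only to m2, where it does match
print(route((15, 15), summaries))  # blocked at the router
```

The event at (5, 5) is a false positive for m1: the router forwards it after a single rectangle test, and the matcher node then does the work of discovering that no hosted subscription matches. Subscriptions with better locality would produce a tighter bounding rectangle and fewer such events.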

Turning to the drawings, wherein like reference numerals refer to like elements, the invention is illustrated as being implemented in a suitable computing environment. Although not required, the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by a personal computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multi-processor systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices. The term computer system may be used to refer to a system of computers such as may be found in a distributed computing environment.

FIG. 1 illustrates an example of a suitable computing system environment 100 on which the invention may be implemented. The computing system environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 100. Although one embodiment of the invention does include each component illustrated in the exemplary operating environment 100, another more typical embodiment of the invention excludes non-essential components, for example, input/output devices other than those required for network communications.

The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to: personal computers, server computers, hand-held or laptop devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in local and/or remote computer storage media including memory storage devices.

With reference to FIG. 1, an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 110. Components of the computer 110 may include, but are not limited to, a processing unit 120, a system memory 130, and a system bus 121 that couples various system components including the system memory to the processing unit 120. The system bus 121 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.

The computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by the computer 110 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation, FIG. 1 illustrates operating system 134, application programs 135, other program modules 136 and program data 137.

The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 1 illustrates a hard disk drive 141 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 151 that reads from or writes to a removable, nonvolatile magnetic disk 152, and an optical disk drive 155 that reads from or writes to a removable, nonvolatile optical disk 156 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 141 is typically connected to the system bus 121 through a non-removable memory interface such as interface 140, and magnetic disk drive 151 and optical disk drive 155 are typically connected to the system bus 121 by a removable memory interface, such as interface 150.

The drives and their associated computer storage media, discussed above and illustrated in FIG. 1, provide storage of computer readable instructions, data structures, program modules and other data for the computer 110. In FIG. 1, for example, hard disk drive 141 is illustrated as storing operating system 144, application programs 145, other program modules 146 and program data 147. Note that these components can either be the same as or different from operating system 134, application programs 135, other program modules 136, and program data 137. Operating system 144, application programs 145, other program modules 146, and program data 147 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 110 through input devices such as a tablet, or electronic digitizer, 164, a microphone 163, a keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 120 through a user input interface 160 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 191 or other type of display device is also connected to the system bus 121 via an interface, such as a video interface 190. The monitor 191 may also be integrated with a touch-screen panel or the like. Note that the monitor and/or touch screen panel can be physically coupled to a housing in which the computing device 110 is incorporated, such as in a tablet-type personal computer. In addition, computers such as the computing device 110 may also include other peripheral output devices such as speakers 197 and printer 196, which may be connected through an output peripheral interface 194 or the like.

The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local area network (LAN) 171 and a wide area network (WAN) 173, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet. For example, in the present invention, the computer 110 may comprise the source machine from which data is being migrated, and the remote computer 180 may comprise the destination machine. Note however that source and destination machines need not be connected by a network or any other means, but instead, data may be migrated via any media capable of being written by the source platform and read by the destination platform or platforms.

When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 1 illustrates remote application programs 185 as residing on memory device 181. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

In the description that follows, the invention will be described with reference to acts and symbolic representations of operations that are performed by one or more computers, unless indicated otherwise. As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by the processing unit of the computer of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the memory system of the computer, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures where data is maintained are physical locations of the memory that have particular properties defined by the format of the data. However, while the invention is being described in the foregoing context, it is not meant to be limiting as those of skill in the art will appreciate that various of the acts and operations described hereinafter may also be implemented in hardware.

FIG. 2 depicts an exemplary publish/subscribe system in accordance with an embodiment of the invention. In an embodiment of the invention, subscribers 202, 204 place subscriptions with a publish/subscribe service 206. The publish/subscribe service 206 hosts each subscription placed with the service 206 by one of the subscribers 202, 204. A publisher 208 publishes events to the service 206. When an event is published, the service 206 examines each of the subscriptions hosted by the service 206. If the event matches a subscription, the subscriber that placed the subscription is notified of the event. It is typical for a plurality of subscribers to receive notification of the same event. The service 206 may notify subscribers 202, 204 directly. Alternatively, the publish/subscribe service 206 may notify one or more of the subscribers 202, 204 via a notification service 210. The notification service 210 is optional but it is often the case that one is a part of a practical publish/subscribe system, for example, to provide a bridge between networks utilizing differing communication protocols and/or to store notifications during subscriber network connectivity interruptions. It is typically a goal of the publish/subscribe service 206 to notify the subscribers 202, 204 of a matching event as soon as possible.

As previously discussed, it is desirable to utilize a multi-node architecture to implement the publish/subscribe service 206. One reason is that a multi-node architecture is capable of being more reliable than a single node architecture; for example, the nodes may be arranged such that the failure of a single node does not disable the publish/subscribe service 206. Another reason is that a multi-node architecture better overcomes scalability hurdles by taking full advantage of parallel processing facilities, for example, in a distributed computing environment. Typically, one of the most process intensive aspects of the publish/subscribe service 206 is the matching of each published event to each of the hosted subscriptions. As a result, one goal of a multi-node architecture is simply to have several matcher nodes. However, to achieve high levels of efficiency, other node types are desirable.

FIG. 3 depicts one example of a multi-node publish/subscribe service in accordance with an embodiment of the invention. In this example, new subscriptions 302 are received by a dedicated subscription router node 304. The subscription router 304 then assigns each new subscription to one of two event matcher nodes 308, 306. Once assigned to one of the event matcher nodes 308, 306, a subscription remains hosted by that event matcher node unless reassigned. Although this example shows only two event matcher nodes 308, 306, embodiments in accordance with the invention are not so limited.

Newly published events 310 are received by a dedicated event router node 312. The event router 312 routes each newly published event to all, some or none of the event matcher nodes 308, 306 in the publish/subscribe service 206. It is typically a goal of the event router 312 to route a newly published event to fewer than all of the event matcher nodes 308, 306 if possible, in order to reduce the number of events received by the matcher nodes 308, 306.

Each of the dedicated matcher nodes 308, 306 attempts to match each received event to each of the subscriptions hosted by that matcher node. In an embodiment of the invention an event notification 314 is generated immediately for each match that occurs, but there are techniques to improve notification efficiency. For example, in one such technique, when a match occurs, a subscriber associated with the matching subscription is added to a list of subscribers to be notified of the newly published event. Once each of the subscriptions hosted by one of the matcher nodes 308, 306 has been examined, that matcher node then generates the event notification 314 for each subscriber on the list of subscribers to be notified. In an embodiment of the invention that includes a notification service or where, for example, the underlying network transport mechanism supports a multi-cast facility, each of the matcher nodes 308, 306 generates a single notification for the event notification 314 that includes the list of subscribers to be notified.
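The list-based notification technique described above may be sketched as follows. Representing a subscription as a pair of subscriber identifier and filter function, the dictionary shape of a notification, and the function names are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = Dict[str, object]
# Assumed representation: (subscriber id, filter function over an event).
Subscription = Tuple[str, Callable[[Event], bool]]


def subscribers_to_notify(event: Event,
                          subscriptions: List[Subscription]) -> List[str]:
    """Examine every hosted subscription once and collect matching subscribers."""
    notify: List[str] = []
    for subscriber, matches in subscriptions:
        # A subscriber with several matching subscriptions is listed only once.
        if matches(event) and subscriber not in notify:
            notify.append(subscriber)
    return notify


def per_subscriber_notifications(event: Event,
                                 subscriptions: List[Subscription]) -> List[dict]:
    """One notification per subscriber on the list."""
    return [{"to": [s], "event": event}
            for s in subscribers_to_notify(event, subscriptions)]


def single_notification(event: Event,
                        subscriptions: List[Subscription]) -> Optional[dict]:
    """One notification carrying the whole list, e.g. for a multi-cast
    transport or a downstream notification service."""
    notify = subscribers_to_notify(event, subscriptions)
    return {"to": notify, "event": event} if notify else None


hosted = [
    ("alice", lambda e: e["Symbol"] == "MSFT"),
    ("bob", lambda e: e["Price"] > 80.00),
    ("alice", lambda e: e["Volume"] > 1_000_000),
]
quote = {"Symbol": "MSFT", "Price": 79.30, "Volume": 40_000_000}

print(subscribers_to_notify(quote, hosted))
print(single_notification(quote, hosted))
```

Deferring notification until all hosted subscriptions have been examined is what allows duplicate notifications to the same subscriber to be collapsed and, where supported, a single message to be sent for the whole list.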

For clarity, FIG. 3 depicts a relatively simple architecture for distributing events from publishers to subscribers, but aspects of the invention are able to be incorporated into much more sophisticated architectures and in fact, the benefits provided by the invention are even greater in more complex architectures. In an embodiment of the invention, an event distribution network (EDN) includes any number of nodes, each capable of communicating with any other, each capable of receiving events and subscriptions, each capable of performing matching and generating notifications. An EDN node is able to be dedicated to a particular role, such as event router or matcher. However, an EDN node is able to serve multiple roles simultaneously, for example, event router and matcher. Furthermore, an EDN node is able to change its role or roles over time in order to, for example, adapt to changing event traffic conditions. In what follows, unless explicitly stated otherwise, matcher node, for example, may be read as “EDN node serving in the role of matcher.” Event router node may be read as “EDN node serving in the role of event router,” and so on.

FIG. 4 depicts one example of an event distribution network suitable for incorporating aspects of the invention. In the illustrated example, some EDN nodes 402, 404, 406 are well placed in the network to serve in the role of event router and the majority of the resources of those nodes is allocated to that role. Other EDN nodes 408, 410, 412, 414 have the majority of their resources allocated to the role of matcher. Each EDN node 402, 404, 406, 408, 410, 412, 414 allocates some resources to the role of subscription router. One EDN node 406 initially dedicates most of its resources to the role of event router, but over time it allocates more and more resources to the role of matcher in order to optimize utilization of the EDN node's resources.

Where multiple EDN nodes serve in the role of event router, they may be organized hierarchically in order, for example, to minimize event traffic within the event distribution network. For example, referring to FIG. 4, one EDN node 402 serves as a primary event router. Other EDN nodes 404, 406 serve as secondary event routers. A primary event router only routes events to secondary event routers. Such routing hierarchies may be hardwired but typically they are not, and in fact, they may be configured automatically and change dynamically utilizing techniques well known in the art.

Before going into the details of implementing an EDN node suitable for use in an event distribution network that incorporates aspects of the invention, it will be helpful to further describe aspects of events, subscriptions and related concepts.

In a content-based publish/subscribe system suitable for incorporating aspects of the invention, an event comprises one or more named attributes. Such an event is said to occur in an event space with a dimension corresponding to each attribute, for example, an event with two named attributes is a two dimensional event and occurs in a two dimensional event space. As another example, an event with named attributes Symbol, Price and Volume is a three dimensional event and occurs in a three dimensional event space. For clarity, the invention will be described with reference to a two dimensional event space but those of skill in the art will appreciate that the invention is not so limited and in fact, provides even greater benefits when incorporated into embodiments that distribute events with higher dimensionality.

In an embodiment of the invention, a subscription comprises a subscription filter and a notification address. In an embodiment of the invention, a notification address comprises a communications protocol address of a subscriber, for example, an internet protocol (IP) address. In an alternate embodiment of the invention, a notification address comprises a communications protocol address of a notification service and a device-independent subscriber identifier, for example, a .NET notification service and a .NET Passport ID.

In an embodiment of the invention, a subscription filter comprises an expression that defines a set of events. In an embodiment of the invention, a subscription filter comprises a conjunction of predicates on a subset of the attributes of an event, for example, Symbol=MSFT and Price>80.00 and Price<120.00. In an embodiment of the invention, a subscription filter may be thought of as describing a rectangle of event space (or, equivalently, a volume of event space where the event space has more than two dimensions).

FIG. 5 illustrates two events and a subscription filter in a two dimensional event space. Event attribute A is capable of taking on at least four values: a₁, a₂, a₃ and a₄ such that a₁<a₂<a₃<a₄. Event attribute B is capable of taking on at least four values: b₁, b₂, b₃ and b₄ such that b₁<b₂<b₃<b₄. One event E₁ is shown occurring at point (a₂, b₂) in event space. A second event E₂ is shown occurring at point (a₃, b₄) in event space. A subscription filter is shown as a rectangle of event space described by (A>a₁ and A<a₄ and B>b₁ and B<b₃). One event E₁ is within the rectangle of event space. That event E₁ matches the subscription filter. Equivalently, that event E₁ falls within the subscription filter. A second event E₂ is not within the rectangle of event space. That event E₂ does not match the subscription filter. Equivalently, that event E₂ does not fall within the subscription filter. An event matches a subscription if it matches the subscription filter associated with the subscription.
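The matching test described with reference to FIG. 5 may be sketched as follows. This is a minimal illustration only: the class name Rect, its field names, and the numeric stand-ins chosen for a₁ to a₄ and b₁ to b₄ are assumptions made for the example and do not appear in the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical filter rectangle: a_lo < A < a_hi and b_lo < B < b_hi."""
    a_lo: float
    a_hi: float
    b_lo: float
    b_hi: float

    def matches(self, a: float, b: float) -> bool:
        # Strict inequalities mirror the predicates A>a1, A<a4, B>b1, B<b3.
        return self.a_lo < a < self.a_hi and self.b_lo < b < self.b_hi


# Assumed numeric stand-ins satisfying a1<a2<a3<a4 and b1<b2<b3<b4.
a1, a2, a3, a4 = 1, 2, 3, 4
b1, b2, b3, b4 = 1, 2, 3, 4

subscription_filter = Rect(a1, a4, b1, b3)
e1 = (a2, b2)  # event E1
e2 = (a3, b4)  # event E2

print(subscription_filter.matches(*e1))  # True: E1 falls within the filter
print(subscription_filter.matches(*e2))  # False: b4 is not below b3
```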

A publish/subscribe service suitable for incorporating aspects of the invention hosts a plurality of subscriptions. The subscription filters associated with the subscriptions may be visualized as a plurality of rectangles in an event space. Some of the filter rectangles may overlap. Some of the filter rectangles may entirely cover other rectangles. Popular areas of the event space may be covered by many filter rectangles. There may be areas of event space that are not covered by a filter rectangle. FIG. 6 illustrates several filter rectangles in an event space. Filter rectangles 602 and 604 overlap, as do filter rectangles 606 and 608. Filter rectangles 602 and 604 are covered by filter rectangle 610. Filter rectangle 612 is covered by filter rectangle 614, which is in turn covered by filter rectangle 606. Filter rectangles 616, 614, 618 and 612 are covered by filter rectangle 606.

A subscription filter describes a rectangle of event space. A set of subscription filters typically describes a more complex area of event space. It is typically possible to precisely describe that complex area of event space with fewer than all of the subscription filters in a set, for example, if some of the subscription filters in the set entirely cover others. A subscription filter in a set that is not entirely covered by another subscription filter is a maximal element of the set. In an embodiment of the invention, the maximal elements of a subscription filter set are a precise summary of the subscription filter set. A precise summary of a subscription filter set precisely describes the same area of event space as does the subscription filter set. For example, in the set of subscription filter rectangles illustrated in FIG. 6, filter rectangles 606, 610 and 608 are a precise summary of the set.
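One way to compute the maximal elements of a subscription filter set is sketched below. It is an illustrative quadratic scan, not a description of any particular embodiment. Rectangles are assumed to be tuples (x_lo, y_lo, x_hi, y_hi), and the function names covers and precise_summary, as well as the sample coordinates, are hypothetical.

```python
def covers(outer, inner):
    """True if rectangle `outer` entirely covers rectangle `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def precise_summary(filters):
    """Return the maximal elements: filters not entirely covered by another.

    Exact duplicates are kept once (the earliest occurrence survives).
    """
    summary = []
    for i, f in enumerate(filters):
        covered = any(
            covers(g, f) and (g != f or j < i)
            for j, g in enumerate(filters) if j != i
        )
        if not covered:
            summary.append(f)
    return summary


filters = [
    (0, 0, 10, 10),   # covers the next two rectangles
    (1, 1, 3, 3),
    (2, 2, 5, 5),
    (8, 8, 12, 12),   # overlaps the first but is not covered by it
]
print(precise_summary(filters))  # [(0, 0, 10, 10), (8, 8, 12, 12)]
```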

Subscription filter set summaries are useful for making routing decisions in an event distribution network. One of the behaviors typically expected of a publish/subscribe service is that each subscriber that has subscribed to an event will receive notification of the publication of the event. In a multi-node publish/subscribe service that meets this expectation, a newly published event must be routed to each matcher node that hosts a subscription to the event. One of the ways that an event router can meet this requirement is to route each event to each matcher node. Prior art event routers have used this “no summary” method, also known as multicast. Another way that an event router can meet this requirement is to route to a matcher only those events that fall within a precise summary of the subscription filter set associated with the subscriptions hosted by the matcher. Prior art event routers have used this “precise summary” method.

The no summary method of event routing and the precise summary method of event routing are two extremes of a spectrum of summary precisions that are useful in an embodiment of the invention for routing events in an event distribution network. An event router utilizing the no summary method is relatively simple to implement and requires relatively little processing power, but each matcher receives a relatively high number of events that do not match any of the subscriptions hosted by the matcher. An event router utilizing the precise summary method routes only those events to a matcher that will result in a match with one or more subscriptions hosted by the matcher. However, this event router must maintain a precise summary of the subscription filter set associated with the subscriptions hosted by a matcher for each of the matchers to which it routes events. In addition, the event router must itself match each newly published event to each precise summary.

In an embodiment of the invention, a goal of an event distribution network is to maximize the number of events that can be distributed in a given time period, that is, to maximize event distribution network throughput. Factors limiting throughput in an event distribution network include network bandwidth, matcher node throughput and event router node throughput. An event router node utilizing the no summary method and an event router node utilizing the precise summary method affect these throughput limits in different ways.

An event router node utilizing the no summary method to route events is not typically itself a limit to overall event distribution network throughput because of its low complexity. However, a matcher node must expend at least some resources to determine that an event does not match any of the subscriptions that it hosts. This is one reason that the throughput of a matcher node in an event distribution network utilizing no summary event routing is typically lower than the throughput of a matcher node in an event distribution network utilizing precise summary event routing. In addition, event distribution networks utilizing no summary event routing may require higher bandwidth and/or encounter a bandwidth limit to throughput.

An event router node utilizing the precise summary method of event routing reduces the number of events routed to a matcher node to the minimum necessary for correct operation. To be able to do so, however, in an embodiment of the invention the precise summaries used by an event router node are so complex that event router node throughput is reduced to the point that the event router node becomes a throughput limit, a bottleneck, for the event distribution network.

FIG. 7 graphically illustrates a theoretical effect of, in an embodiment of the invention, varying the level of summary precision used by an event router on event distribution network throughput (relative to event distribution networks that use no summary event routing or precise summary event routing). The horizontal axis is the level of summary precision used by an event router. The right limit of the graph is 100% summary precision, corresponding to an event router utilizing precise summaries. The left limit of the graph is 0% summary precision, corresponding to an event router utilizing no summary. A point elsewhere on the graph corresponds to an event router utilizing an imprecise summary. An imprecise summary is a subscription filter set summary (or equivalently, a subscription set summary) with a level of summary precision less than 100% (i.e., less than a precise summary) and greater than 0% (i.e., greater than no summary). The vertical axis is event distribution network throughput.

FIG. 7 illustrates, in accordance with an embodiment of the invention, the predicted effect that as summary precision is reduced from 100%, subscription filter summaries used by an event router become less complex. In an embodiment of the invention, less complex subscription filter summaries result in a lower per-event processing time at an event router, so that event router throughput is higher. That is, as subscription filter summary precision is lowered, event router throughput is raised. However, at the same time, reducing summary precision below 100% results in false positive event traffic, that is, in events being routed to matcher nodes that do not host subscriptions for those events. In an embodiment of the invention, false positive event traffic increases internal event distribution network bandwidth utilization and decreases matcher node throughput. An optimal event distribution network throughput (TP_(Opt)) and the corresponding optimal subscription filter summary precision used by network event routers (SP_(Opt)) occur when an increase in event distribution network throughput as a result of reduced summary precision is balanced by a decrease in event distribution network throughput as a result of increased false positive event traffic. The actual optimal level of summary precision varies in accordance with various embodiments of the invention. Factors affecting the optimal level of summary precision include processing power of the matchers and event routers, event distribution network bandwidth, and the dimensionality of events, subscriptions and subscription summaries.

The principle illustrated by FIG. 7 is used to optimize event distribution network throughput in relatively simple event distribution networks with a single event router, such as the example depicted in FIG. 3. It is also used in more sophisticated event distribution networks with multiple nodes that serve in the role of event router, such as the example depicted in FIG. 4. Where there are multiple nodes serving in the role of event router, each may have the same level of summary precision, but this is not necessary. In addition, a matcher node may itself utilize an imprecise summary of the subscription filter set associated with the subscriptions hosted by the matcher to prescreen events before undertaking the full matching operation, although this technique is only useful where the level of summary precision used by an event router is significantly lower than that used by the matcher, e.g., no summary. An embodiment of the invention includes the capability to utilize an imprecise level of summary precision in order to optimize event distribution network throughput.

In an embodiment of the invention, the optimal level of summary precision is investigated by manually adjusting summary precision during a period of typical event traffic and measuring the throughput of the event distribution network, preferably starting with 100% precision and relaxing it until the maximum of the graph illustrated in FIG. 7 is detected. In an alternative embodiment of the invention, the optimal level of summary precision is investigated by making a copy of the subscriptions hosted by an event distribution network and placing those subscriptions in a suitable event distribution network simulation environment. Using the simulation environment, graphs of event distribution network throughput versus summary precision are constructed for varying types of event traffic (e.g., in terms of event frequency and distribution in event space), including typical event traffic. The optimal level of summary precision generally corresponds to the peak of the graph for typical event traffic. However, the graphs for atypical event traffic are given due weight (i.e., in proportion to their likelihood) and if it is determined that there is a different summary precision that results in a higher event distribution network throughput over a range of event traffic conditions, then that different summary precision is selected. An advantage of using a simulation environment is that it is also able to be used to determine the effects on event distribution network throughput of changing event distribution network variables such as the number of event router nodes, event router node characteristics (e.g., per-event routing time), the number of matcher nodes, matcher node characteristics (e.g., per-event matching time), available bandwidth, and so on.

As new subscriptions are added to an event distribution network, the optimal level of summary precision changes. It is desirable for the event distribution network to be able to automatically adjust the level of summary precision used in event routing so as to optimize event distribution network throughput. An event distribution network incorporating an aspect of the invention is so enabled. In an embodiment of the invention, each event router node in the event distribution network monitors its own node resource utilization. If an event router node determines that it is overloaded, the event router decreases the level of summary precision it uses to route events so as to prevent becoming an event distribution network bottleneck. If the event router node determines that it is under-utilized, the event router increases the level of summary precision that it uses to route events so as to reduce the negative effects of false positive event traffic on event distribution network throughput. There may be periods during the lifetime of the event distribution network in which the optimal level of summary precision is 100%, corresponding to a precise summary, or 0%, corresponding to no summary (i.e., multicast), but in general the optimal level occurs at some imprecise (i.e., intermediate) level of summary precision. In addition, the event router node optionally includes a suitable mechanism for dampening oscillations in summary precision, for example, increasing the time period between summary precision adjustments if oscillation is detected.
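The adjustment rule described above may be sketched as a simple threshold controller. This is a sketch under stated assumptions: the function name adjust_precision, the utilization thresholds (0.9 and 0.5), and the step size (0.05) are hypothetical values chosen for illustration, and the optional oscillation dampening is omitted.

```python
def adjust_precision(precision, utilization, step=0.05, high=0.9, low=0.5):
    """Return a new summary precision in [0.0, 1.0].

    precision   -- current summary precision (1.0 = precise, 0.0 = no summary)
    utilization -- event router resource utilization in [0.0, 1.0]
    step, high, low -- assumed tuning values, not taken from the text
    """
    if utilization > high:
        # Overloaded router: use a coarser summary to avoid a bottleneck.
        precision -= step
    elif utilization < low:
        # Under-utilized router: use a finer summary to cut false positives.
        precision += step
    return min(1.0, max(0.0, precision))


# Example: an overloaded router relaxes precision over successive checks.
p = 1.0
for load in (0.95, 0.97, 0.80, 0.30):
    p = adjust_precision(p, load)
    print(round(p, 2))
```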

An R-tree is an example of a data structure incorporated into an embodiment of the invention to index subscription filter summaries for efficient matching and to support variable summary precision. The R-tree data structure is well known in the art, so only some of its features are highlighted here. An R-tree is a dynamic index structure for multi-dimensional data rectangles (e.g., subscription filter rectangles). It is a height balanced tree similar to a B-tree, with the data rectangles residing at the leaf nodes. Each non-leaf node may have two or more child nodes. Each non-leaf node maintains a minimum bounding rectangle of each of the rectangles associated with its child nodes.

One of the reasons that an R-tree data structure is suitable for use in an embodiment of the invention is that a minimum bounding rectangle of a set of subscription filter rectangles is an imprecise summary of the subscription filters. FIG. 8A illustrates the concept of a minimum bounding rectangle. A first subscription filter rectangle 802 and a second subscription filter rectangle 804 have a minimum bounding rectangle 806. Both filter rectangles 802, 804 are required for a precise summary. The single bounding rectangle 806 is utilized in an embodiment of the invention as an imprecise summary. An event router utilizing the bounding rectangle 806 to route events to a matcher hosting the subscription filter rectangles 802, 804 makes the routing decision in half the time because it need only match the event against one rectangle instead of two. However, events that fall within the un-shaded portions of the bounding rectangle 806 would be routed, resulting in false positive event traffic.
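The trade-off illustrated by FIG. 8A may be sketched as follows. The coordinates are invented for the example and are not taken from the figure; rectangles are assumed to be tuples (x_lo, y_lo, x_hi, y_hi), and the helper names mbr and contains are hypothetical.

```python
def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def contains(rect, point):
    """True if the point falls strictly inside the rectangle."""
    x, y = point
    return rect[0] < x < rect[2] and rect[1] < y < rect[3]


# Assumed stand-ins for filter rectangles 802 and 804.
filters = [(0, 0, 2, 2), (3, 3, 5, 5)]
summary = mbr(filters)  # stand-in for bounding rectangle 806


def routed(event):
    # The router checks one rectangle instead of two.
    return contains(summary, event)


def truly_matches(event):
    return any(contains(f, event) for f in filters)


hit = (1, 1)             # inside the first filter
false_positive = (4, 1)  # inside the bounding rectangle only

print(summary)                                               # (0, 0, 5, 5)
print(routed(hit), truly_matches(hit))                       # True True
print(routed(false_positive), truly_matches(false_positive)) # True False
```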

An R-tree with the maximal elements of a subscription filter set as its leaf nodes is used by an event router to efficiently route events at a summary precision of 100%, that is, in this case the R-tree is indexing a precise summary of the subscription filter set. In an embodiment of the invention, in order to reduce summary precision, the number of R-tree leaf nodes is reduced. At least two of the R-tree leaf nodes are replaced with a single new leaf node. The data rectangle associated with the single new leaf node is the minimum bounding rectangle of the data rectangles associated with the leaf nodes that are replaced. Replacing only two leaf nodes with a single new leaf node provides for a minimum step of summary precision reduction. It may be efficient to utilize larger steps of summary precision reduction, particularly where non-leaf nodes of the R-tree typically have more than two child nodes. Increasing the level of summary precision is the inverse of the reduction process, e.g., a single leaf node of an imprecise summary is replaced by the at least two original leaf nodes that it summarized.

For a sufficiently large number of data rectangles, variations on this R-tree leaf node replacement scheme are used by an embodiment of the invention to achieve essentially arbitrary levels of summary precision, although the steps between levels of precision become larger at lower levels of precision. For example, if an R-tree has 100 data rectangles representing the maximal elements of a subscription filter set as its leaf nodes and two of those leaf nodes are replaced with a single leaf node, the resulting R-tree indexes an imprecise summary of the subscription filter set at a 99% level of summary precision. If two pairs of leaf nodes are each replaced with a single leaf node, the level of summary precision is 98%, and so on. At a 50% level of summary precision, each of the original leaf nodes has been paired and each pair replaced by a single new leaf node. In this case, a measure of the level of precision of a summary of a subscription filter set is calculated as the ratio of the number of rectangles in the imprecise summary to the number of rectangles in the precise summary.
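The precision measure and the pairwise replacement step may be sketched as follows. For brevity a flat list stands in for the R-tree leaf level and adjacent list entries are treated as siblings, which is an assumption of this sketch; the function names summary_precision and merge_pairs are hypothetical.

```python
def summary_precision(imprecise_count, precise_count):
    """Ratio of rectangles in the imprecise summary to the precise summary."""
    return imprecise_count / precise_count


def merge_pairs(rects, pairs_to_merge):
    """Replace the first `pairs_to_merge` adjacent pairs with their
    minimum bounding rectangles; remaining rectangles are kept as-is."""
    merged = []
    i = 0
    while i < len(rects):
        if pairs_to_merge > 0 and i + 1 < len(rects):
            a, b = rects[i], rects[i + 1]
            merged.append((min(a[0], b[0]), min(a[1], b[1]),
                           max(a[2], b[2]), max(a[3], b[3])))
            pairs_to_merge -= 1
            i += 2
        else:
            merged.append(rects[i])
            i += 1
    return merged


# 100 unit rectangles standing in for the maximal elements of a filter set.
precise = [(i, 0, i + 1, 1) for i in range(100)]

for pairs in (1, 2, 50):
    imprecise = merge_pairs(precise, pairs)
    print(pairs, len(imprecise), summary_precision(len(imprecise), len(precise)))
# 1 99 0.99
# 2 98 0.98
# 50 50 0.5
```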

FIG. 9A shows an example R-tree indexing a precise summary of a subscription filter set. The data rectangles associated with R-tree leaf nodes 902, 904, 906, 908, 910, 912, 914, 916 are the maximal elements of a subscription filter set. The data rectangle associated with each of the non-leaf nodes 918, 920, 922, 924, 926, 928, 930 is the minimum bounding rectangle of its child nodes, for example, the data rectangle associated with a non-leaf node 918 is the minimum bounding rectangle of its child nodes 902, 904. In this example each non-leaf node has two child nodes. This is not necessary, although typically the number of child nodes is some power of two.

FIG. 9B shows the R-tree of FIG. 9A modified to index an imprecise summary of the same subscription filter set. A pair of leaf nodes (902, 904 in FIG. 9A) has been replaced by a new leaf node 932. The data rectangle associated with the new leaf node 932 is the minimum bounding rectangle of the data rectangles associated with the leaf nodes (902, 904 in FIG. 9A) that have been replaced. This is an example of a particularly efficient summary precision reduction because the minimum bounding rectangle of the replaced leaf nodes (902, 904 in FIG. 9A) was already maintained by their R-tree parent (918 in FIG. 9A). Where the precise summary had eight leaf nodes, the imprecise summary has seven, so the level of summary precision is ⅞ or 87.5%. In this example, the level of summary precision is restored to 100% by restoring the removed leaf nodes (902, 904 in FIG. 9A) as child nodes of node 918.

A Bloom filter is another example of a data structure used by an embodiment of the invention to store subscription filter summaries for efficient matching and to support variable summary precision. Where a subscription filter comprises only equality predicates (e.g., Symbol=MSFT) or where an important subset of subscription filter predicates are required to be equality predicates, data structures even more efficient than an R-tree are used by an embodiment of the invention.

An example of a precise summary of a subscription filter set that is used in this case is a symbol dictionary. One way of utilizing a symbol dictionary for event routing is presented to provide context for comparison with an embodiment of the invention. For each event attribute that is filtered utilizing only equality predicates (each “equality filtered attribute”), the symbol dictionary at an event router is loaded with the symbols contained in the subscription filter set hosted by a matcher. The result is one or more symbol dictionaries at the event router for each matcher. When a newly published event arrives at the event router, the event router determines, for each equality filtered attribute, if the attribute value is contained in the corresponding symbol dictionary. If it is, the event is routed to the corresponding matcher. In an embodiment of the invention where there are multiple equality filtered attributes, the attribute value must be contained in each corresponding symbol dictionary in order for the event to be routed.
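Symbol dictionary routing as described above may be sketched with one set of symbols per equality filtered attribute per matcher. The matcher names, the Exchange attribute, and the function name route are hypothetical and serve only to illustrate the multiple-attribute case.

```python
def route(event, dictionaries_by_matcher):
    """Return the matchers whose symbol dictionaries all contain the
    event's value for the corresponding equality filtered attribute."""
    destinations = []
    for matcher, dictionaries in dictionaries_by_matcher.items():
        if all(event.get(attribute) in symbols
               for attribute, symbols in dictionaries.items()):
            destinations.append(matcher)
    return destinations


# One dictionary per equality filtered attribute, per matcher (assumed data).
dictionaries_by_matcher = {
    "matcher_1": {"Symbol": {"MSFT", "IBM"}},
    "matcher_2": {"Symbol": {"MSFT"}, "Exchange": {"NASDAQ"}},
}

event = {"Symbol": "MSFT", "Exchange": "NYSE", "Price": 100.0}
print(route(event, dictionaries_by_matcher))  # ['matcher_1']
```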

When the number of symbols in a symbol dictionary becomes sufficiently large, an event router utilizing the symbol dictionary as a precise summary will become an event distribution network bottleneck. In an embodiment of the invention, a Bloom filter is used as an imprecise summary in place of the symbol dictionary. The Bloom filter is well known in the art, so only some of its features are highlighted here. Like the symbol dictionary, the Bloom filter is loaded with a set of symbols, but instead of a symbol list, the Bloom filter is a bit vector, making it more efficient for a large number of symbols (e.g., over 100,000 symbols). Various bits are set in the vector corresponding to each symbol. The bits to set are selected by a hash function. To check if a symbol has been loaded into the bit vector, the symbol is run through the same hash function. If the corresponding bits are set in the bit vector, then there is a high probability that the symbol has previously been loaded into the bit vector (i.e., the symbol falls within the Bloom filter). The hash function is chosen to make the Bloom filter useful as a filter but it does not guarantee that a symbol hash is unique or, for example, that a symbol hash will not collide with some combination of other symbol hashes. There is some level of false positive match.
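A Bloom filter of the kind described above may be sketched as follows. The choice of SHA-256 from the Python standard library, the use of three salted hashes per symbol, and the class and method names are assumptions of this sketch; the text does not specify a hash function. One byte is used per bit position purely for readability.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over string symbols (illustrative only)."""

    def __init__(self, width, num_hashes=3):
        self.width = width            # number of bit positions
        self.num_hashes = num_hashes  # assumed value, not from the text
        self.bits = bytearray(width)  # one byte per bit, for clarity

    def _positions(self, symbol):
        # Derive several bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{symbol}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.width

    def add(self, symbol):
        for position in self._positions(symbol):
            self.bits[position] = 1

    def might_contain(self, symbol):
        # True means "probably loaded"; False means "definitely not loaded".
        return all(self.bits[p] for p in self._positions(symbol))


summary = BloomFilter(width=1024)
for symbol in ("MSFT", "IBM"):
    summary.add(symbol)
print(summary.might_contain("MSFT"))  # True: loaded symbols always match

# A degenerate one-bit filter shows a false positive deterministically:
# every symbol maps to position 0, so any symbol appears to be loaded.
tiny = BloomFilter(width=1)
tiny.add("MSFT")
print(tiny.might_contain("AAPL"))  # True, although AAPL was never loaded
```

Narrowing the bit vector raises the false positive rate, which corresponds to lowering the effective level of summary precision.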

When, in an embodiment of the invention, the Bloom filter is incorporated into an event router as an imprecise summary, there is a gain in event router throughput at the expense of some false positive event traffic. The level of false positive traffic and thus the effective level of summary precision is set by varying the width of the Bloom filter bit vector. The discussion with reference to FIG. 7 applies. As in the R-tree example, event distribution network throughput is increased by utilizing an imprecise summary.

In an embodiment of the invention, utilizing imprecise summaries increases event router node throughput at the cost of some increase in false positive event traffic. False positive traffic has a negative impact on bandwidth utilization and matcher node throughput. Another factor affecting false positive event traffic is the way in which the subscriptions hosted by a multi-node publish/subscribe service are partitioned (i.e., divided up) among the matcher nodes. If the subscriptions are partitioned randomly, as in some prior art systems, the subscriptions hosted by each matcher node will, in general, have poor locality, that is, the filter rectangles associated with the subscriptions will be spread out across event space, rather than concentrated in one portion of it.

FIG. 8B and FIG. 8C help illustrate the concept of locality. In FIG. 8B, a filter rectangle 808 is added to the subscription filter set 802, 804. The new subscription filter set 802, 804, 808 has relatively poor locality. If the minimum bounding rectangle 810 of the subscription filter set 802, 804, 808 is used as an imprecise summary by an event router, it will route a relatively high level of false positive event traffic to the matcher hosting the subscription filter set 802, 804, 808. FIG. 8C provides a comparison. In FIG. 8C, a filter rectangle 812 is added to the subscription filter set 802, 804. The new subscription filter set 802, 804, 812 has relatively good locality. If the minimum bounding rectangle 814 of the subscription filter set 802, 804, 812 is used as an imprecise summary by an event router, it will route a relatively low level of false positive event traffic to the matcher hosting the subscription filter set 802, 804, 812. The minimum bounding rectangle of a set of subscriptions is the same as the minimum bounding rectangle of a subscription filter set associated with the subscriptions. The summary of a set of subscriptions is the same as the summary of a subscription filter set associated with the subscriptions.

In an embodiment of the invention, if a set of subscriptions has poor locality then, for a given level of summary precision, an imprecise summary of the subscriptions will be broader than if the set of subscriptions had good locality. A broader imprecise summary at a given level of summary precision theoretically results in additional false positive event traffic without an increase in event router node throughput. In an embodiment of the invention, partitioning subscriptions so that the set of subscriptions hosted by each matcher node has good locality results in less false positive event traffic for a given level of event router node throughput or equivalently, higher event router node throughput for a given level of false positive event traffic. This in turn allows a higher event distribution network throughput for a given level of summary precision, that is, the peak of the graph illustrated by FIG. 7 is higher. In addition to partitioning for good locality, an embodiment of the invention partitions in order to avoid overloading any one matcher node. In an embodiment of the invention, the goal of partitioning to avoid overloading any one matcher node competes for priority with the goal of partitioning for good locality. Overloaded matcher nodes result in suboptimal event distribution network throughput, so a balance between the two partitioning goals is desirable.

There are two basic approaches to subscription partitioning: event space partitioning (ESP) and filter set partitioning (FSP). In event space partitioning, each matcher node is assigned responsibility for an area of event space. Event space partitioning is achieved by assigning a subscription to a matcher node if the subscription filter associated with the subscription falls within the area of event space for which the matcher node is responsible. If a subscription filter associated with a subscription is cut by one or more partitions, the subscription must be replicated on the matcher nodes responsible for each of the areas that the subscription filter covers. An advantage of event space partitioning is that each newly published event is routed to at most one matcher node. However, the disadvantage of having to replicate a subscription across multiple matcher nodes makes filter set partitioning preferable. In filter set partitioning, each subscription is assigned to a single matcher node. If two or more subscriptions on different matcher nodes have subscription filters that match the same event, then the event is routed to multiple matcher nodes.

FIG. 10A, FIG. 10B and FIG. 10C show four filter rectangles 1002, 1004, 1006, 1008 that are to be partitioned. Each figure shows those four filter rectangles partitioned in a different way. FIG. 10A shows an event space partition (ESP) 1010 dividing the event space into two areas: upper and lower. Each of the subscriptions associated with the four filter rectangles will be replicated on the matcher node responsible for the upper event space and the matcher node responsible for the lower event space. FIG. 10B shows one filter set partitioning (FSP) of the four filter rectangles 1002, 1004, 1006, 1008. Two filter rectangles 1002, 1006 are assigned to one filter set partition 1012. The other two filter rectangles 1004, 1008 are assigned to a second filter set partition 1014. An event occurring in the event space covered by the intersection of filter rectangles 1002 and 1004 will be routed to the matcher node responsible for the first partition 1012 as well as to the matcher node responsible for the second partition 1014. FIG. 10C shows another filter set partitioning (FSP) of the four filter rectangles 1002, 1004, 1006, 1008. Two filter rectangles 1002, 1004 are assigned to one filter set partition 1016. The other two filter rectangles 1006, 1008 are assigned to a second filter set partition 1018. An event occurring in the event space covered by the intersection of filter rectangles 1002 and 1004 will be routed only to the matcher node responsible for the first partition 1016. The filter set partitions depicted in FIG. 10C have better locality than the filter set partitions depicted in FIG. 10B. Visually, for example, the unshaded area within filter set partition 1016 is less than the unshaded area within filter set partition 1012. In an embodiment of the invention, the filter set partitions depicted in FIG. 10C result in less false positive event traffic than the filter set partitions depicted in FIG. 10B.

In an embodiment of the invention there are two modes of subscription partitioning: offline partitioning and online partitioning. Offline partitioning begins with a set of subscriptions and is free from real-time constraints (i.e., the partitioning occurs over minutes or hours rather than completing in seconds or in fractions of a second). Examples of when offline partitioning is used in an embodiment of the invention include: when a successful service transitions from a single node publish/subscribe service to a multi-node publish/subscribe service, and when a multi-node publish/subscribe service is periodically repartitioned. In an embodiment of the invention, online partitioning takes place in an operational event distribution network where an established partitioning already exists and the assignment of a subscription to a matcher node occurs in real-time. Online partitioning is also known as new subscription routing. In an embodiment of the invention, subscription router 304 in FIG. 3 utilizes online partitioning. In an embodiment of the invention, each EDN node 402, 404, 406, 408, 410, 412, 414 in FIG. 4 utilizes online partitioning when serving in the role of subscription router.

An embodiment of the invention utilizes an R-tree to implement anoffline filter set partitioning in which the set of subscriptionsassigned to each matcher node has good locality and which avoidsoverloading any one matcher node. A given set of subscriptions is to bedivided into a number of partitions equal to the number of matcher nodesin an event distribution network. First, the number of children of theroot node of the R-tree is set equal to the number of destinationmatcher nodes. A top-down R-tree loading algorithm is utilized to loadthe R-tree with the subscription filter rectangles associated with theset of subscriptions. Top-down R-tree loading algorithms are known inthe art, so only some of their features are highlighted here.

At each level of the R-tree (for example, in FIG. 9A nodes 926 and 928 are a level, and nodes 918, 920, 922 and 924 are another level) the filter rectangles are partitioned as follows. First, the filter rectangles are sorted based on their minimum, maximum and center coordinates in each dimension. Then, each cut orthogonal to the coordinate axes that would result in a balanced and packed R-tree is considered. Next, the cut that minimizes a cost function is greedily selected (i.e., selected without regard for a global optimum). Finally, like cuts are applied recursively to the newly created partitions until the desired number of partitions at a level is achieved. Each level is added in turn until the desired number of rectangles per partition is achieved.

In an embodiment of the invention, the cost function is chosen to be the sum of the areas of the minimum bounding rectangles of the two candidate subscription filter sets on the two sides of a cut. If published events are uniformly distributed throughout the event space, minimizing this sum corresponds well to minimizing the sum of event traffic. In an embodiment of the invention, once the R-tree is loaded, the set of subscription filters indexed by each sub-tree rooted at a child of the root node is assigned to a matcher node. An R-tree has been used by prior art systems to efficiently implement the matching operation of a matcher node. In an embodiment of the invention, the set of subscription filters indexed as well as the indexing sub-tree rooted at a child of the root node are assigned to a matcher node, to be used by the matcher node to implement the matching operation.
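The greedy cut selection and the cost function described above can be illustrated with a minimal Python sketch. It assumes a two-dimensional event space, rectangles given as (xmin, ymin, xmax, ymax) tuples, and, as a simplification, considers only the single median cut for each sort order rather than every balanced cut. The names mbr, area and best_cut are hypothetical and do not come from the specification.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout

def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def best_cut(rects: List[Rect]):
    """Greedily pick the median cut whose two sides have the smallest summed MBR area.

    For each dimension the rectangles are sorted by minimum, maximum and
    center coordinate; each ordering is split in half (a balanced cut).
    Returns (cost, left_side, right_side).
    """
    half = len(rects) // 2
    best = None
    for axis in (0, 1):
        sort_keys = (
            lambda r: r[axis],                # minimum coordinate
            lambda r: r[axis + 2],            # maximum coordinate
            lambda r: r[axis] + r[axis + 2],  # twice the center coordinate
        )
        for key in sort_keys:
            ordered = sorted(rects, key=key)
            left, right = ordered[:half], ordered[half:]
            cost = area(mbr(left)) + area(mbr(right))
            if best is None or cost < best[0]:
                best = (cost, left, right)
    return best
```

Applying best_cut recursively to each side until the desired number of partitions is reached would give one level of the partitioning; the recursion itself is omitted here.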

An embodiment of the invention utilizes an R-tree to implement an onlinefilter set partitioning in which a new subscription is assigned to amatcher node so as to maintain good locality while avoiding overloadingany one matcher node. An event distribution network has a number ofmatcher nodes, each hosting a set of subscriptions. In an embodiment ofthe invention, a subscription router node utilizes R-trees to index asubscription filter set summary for each matcher node. Both precise andimprecise summaries are suitable for new subscription routing.

In an embodiment of the invention, a new subscription arrives at the subscription router node and then the subscription is routed to a matcher node as follows. At first each matcher node is a candidate to host the new subscription. The subscription router node eliminates those candidates that are already loaded above a threshold (e.g., 2, 3 or 4 times the average matcher node load). In an embodiment of the invention, allowing a higher level of load imbalance results in better subscription locality at each matcher node. For each of the remaining candidates, the amount of overlap between the filter rectangle associated with the new subscription and the data rectangles of each of the leaf nodes of each of the candidate R-trees is calculated. The candidate matcher node associated with the R-tree whose leaf node resulted in the greatest overlap is selected. The new subscription is routed to the selected matcher node for hosting.

In an embodiment of the invention, the matcher node that the new subscription is routed to itself utilizes an R-tree to efficiently implement the matching operation. The filter rectangle associated with the new subscription is added to the R-tree in the following manner. Starting at the root node of the R-tree, each child node is a candidate. Recall that each non-leaf node maintains the minimum bounding rectangle of its child nodes. For each child node, the minimum bounding rectangle area if the new filter rectangle were added is calculated. The child node with the minimum increase in minimum bounding rectangle area is greedily selected (i.e., selected without regard for a global optimum). Each child of the selected node is now a candidate, and the process is repeated until a node is found where the new filter rectangle is able to be added as a leaf node. In this way, filter rectangles with large intersections will reside at leaf nodes that are close to each other in the R-tree. In an embodiment of the invention, a subscription filter summary implemented utilizing an R-tree will then have better locality than if the matching R-tree was grown randomly.
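The descent by minimum increase in bounding rectangle area can be sketched as follows. This is a simplified illustration under stated assumptions: rectangles are (xmin, ymin, xmax, ymax) tuples, the Node class is hypothetical, and node splitting on overflow (which a full R-tree requires) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout

def union(a: Rect, b: Rect) -> Rect:
    """Minimum bounding rectangle of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

@dataclass
class Node:
    """Hypothetical R-tree node: a node with no children stores filter rectangles."""
    mbr: Rect
    children: List["Node"] = field(default_factory=list)
    rects: List[Rect] = field(default_factory=list)

def insert(root: Node, rect: Rect) -> Node:
    """Descend greedily to the child whose MBR area grows least, then store rect.

    Overflow handling (node splitting) is omitted in this sketch.
    """
    node = root
    node.mbr = union(node.mbr, rect)
    while node.children:
        node = min(node.children,
                   key=lambda c: area(union(c.mbr, rect)) - area(c.mbr))
        node.mbr = union(node.mbr, rect)  # keep bounding rectangles up to date
    node.rects.append(rect)
    return node
```

Because the choice at each level is local, two filter rectangles with a large intersection tend to follow the same path and end up in the same or neighboring nodes.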

In an embodiment of the invention, for a given set of subscriptions, an event distribution network that utilizes only online filter set partitioning has partitions with poorer locality than an event distribution network that utilizes only offline filter set partitioning. An initial offline partitioning in combination with ongoing online partitioning results in partitions with better locality than online partitioning alone.

In an embodiment of the invention where some subset of the subscription filter predicates are required to be equality predicates, event space partitioning of those predicates is desirable. Equality predicates cannot be cut by an event space partition, so event space partitioning of equality predicates has the advantage of routing each newly published event to at most one matcher node without the disadvantage of having to replicate subscriptions. Subscription locality isn't an issue for event space partitioning in the same way that it is for filter set partitioning, so the focus for optimizing event distribution network throughput is on matcher node load balancing.

In an embodiment of the invention, fine-grained load balancing without the need for repartitioning is achieved by initially over-partitioning the event space, e.g., forty partitions for four matcher nodes, and assigning multiple event space partitions to each matcher node. When a load imbalance is detected, the responsibility for one or more partitions is moved from a most heavily loaded matcher node to other less loaded matcher nodes. In an embodiment of the invention, moving responsibility for a partition from one matcher node to another comprises reassignment on a partition map maintained at an event router and migration of the subscriptions associated with the partition if the subscriptions are not already present on the destination matcher node.

In an embodiment of the invention, assigning partitions to matcher nodes takes place in the following manner. First, each partition is assigned a weight corresponding to the load a matcher node will incur if assigned the partition. Next, the partitions are sorted in order of decreasing weight. Then, each partition in the sequence is assigned to the matcher node that has the minimum load at that point in the sequence. This algorithm is known in the art, so only some of its features are described here. In an embodiment of the invention, the weight utilized to sort a partition comprises the product of the number of subscriptions in the partition and the number of unique equality predicates associated with the subscriptions in the partition. In an embodiment of the invention, a like algorithm is utilized to reassign partitions from a most heavily loaded matcher node to less heavily loaded matcher nodes.
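The assignment procedure above (sort by decreasing weight, then give each partition to the currently least loaded matcher node) can be sketched in Python as follows. The function names and the dictionary-based inputs are assumptions made for illustration; ties between equally loaded matcher nodes are broken here by node name, which the specification does not prescribe.

```python
import heapq
from typing import Dict, List

def partition_weight(num_subscriptions: int, num_unique_equality_predicates: int) -> int:
    """Weight of a partition: subscriptions times unique equality predicates."""
    return num_subscriptions * num_unique_equality_predicates

def assign_partitions(weights: Dict[str, int], matchers: List[str]) -> Dict[str, str]:
    """Map each partition to a matcher node, heaviest partition first.

    A min-heap of (cumulative load, matcher) yields the least loaded
    matcher node at each point in the sequence.
    """
    heap = [(0, m) for m in matchers]
    heapq.heapify(heap)
    assignment: Dict[str, str] = {}
    for part, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        load, matcher = heapq.heappop(heap)
        assignment[part] = matcher
        heapq.heappush(heap, (load + weight, matcher))
    return assignment
```

With forty partitions and four matcher nodes, as in the over-partitioning example, the same function could be called again on only the partitions being moved off an overloaded node, seeding the heap with the current loads of the remaining nodes.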

FIG. 11A shows an EDN node architecture suitable for use in an eventdistribution network that incorporates aspects of the invention. In anembodiment of the invention, unless otherwise specified, each EDN nodearchitecture module is capable of communicating with and invoking thefunctionality of another EDN node architecture module whether or not themodules are, for example, adjacent in the figure. FIG. 11B shows themodules utilized by an EDN node configured as a dedicated event routernode, such as event router 312 in FIG. 3. Shaded modules 1104, 1112,1108, 1116 are in active use in a dedicated event router node. Unshadedmodules 1102, 1106, 1114, 1110 are either inactive or not present in adedicated event router node. When a module is inactive, the module isnot communicated with and its functionality is not invoked. When amodule is not present, it is not possible to communicate with the moduleor to invoke its functionality. FIG. 11C shows the modules utilized byan EDN node configured as a dedicated matcher node, such as matcher 308in FIG. 3. Shaded modules 1102, 1106, 1104, 1108, 1110 are in active usein a dedicated matcher node. Unshaded modules 1112, 1114, 1116 areeither inactive or not present in a dedicated matcher node. FIG. 11Dshows the modules utilized by an EDN node configured as a dedicatedsubscription router node, such as subscription router 304 in FIG. 3.Shaded modules 1104, 1114, 1108, 1116 are in active use in a dedicatedsubscription router node. Unshaded modules 1102, 1106, 1112, 1110 areeither inactive or not present in a dedicated subscription router node.

In an embodiment of the invention, an EDN node such as EDN node 406 inFIG. 4 has each of the modules illustrated in FIG. 11A present. If anEDN node is serving in the roles of matcher, event router andsubscription router, then each of the modules is active. If an EDN nodeis serving in the roles of event router and subscription router, theshaded modules shown in both FIG. 11B and FIG. 11D are active, and soon. In addition, in an embodiment of the invention, the functionality ofan add new subscription module 1110 and a match event to subscriptionsmodule 1106 is incorporated into a subscriptions module 1102. In anembodiment of the invention, the functionality of an update route module1108 is incorporated into a subscription summary module 1104. Othervariations are possible, as will be appreciated by one of skill in theart.

Referring to FIG. 11A, a subscriptions module 1102 maintains the subscriptions hosted by an EDN node. In an embodiment of the invention, the subscriptions module 1102 maintains an R-tree to index the filter rectangles associated with subscriptions hosted by the EDN node. A subscription summary module 1104 maintains one or more subscription filter set summaries. In an EDN node that serves in the role of matcher, a summary maintained by the subscription summary module 1104 is a precise summary of the subscription filter set associated with the subscriptions hosted by the EDN node (a "precise subscription summary"). In an embodiment of the invention, a dirty flag is set in the subscriptions module 1102 whenever a new subscription is added to the set of subscriptions hosted by an EDN node and cleared whenever a precise subscription summary is updated to include new subscriptions. In an embodiment of the invention, the subscription summary module 1104 periodically (e.g., as triggered by a timer) determines whether the set of subscriptions maintained by the subscriptions module 1102 has changed (e.g., checks a dirty flag maintained by the subscriptions module 1102). If the set of subscriptions has changed since the previous summarization, the set of subscriptions is re-summarized. In an alternative embodiment, an update route module 1108 determines whether the set of hosted subscriptions has changed and the update route module triggers an update of the precise subscription summary if required. In another alternative embodiment, the subscriptions module 1102 itself triggers the precise subscription summary update whenever a new subscription is added. An add new subscription module 1110 is another module well suited to this role. Those of skill in the art will appreciate that other variations are possible.
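The dirty-flag interaction between the subscriptions module 1102 and the subscription summary module 1104 can be sketched as below. The class and method names are hypothetical, and the precise summary is modeled, as an assumption, simply as a copy of the hosted filter rectangles rather than as an R-tree.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout

class Subscriptions:
    """Stands in for subscriptions module 1102."""
    def __init__(self) -> None:
        self.filters: List[Rect] = []
        self.dirty = False

    def add(self, rect: Rect) -> None:
        self.filters.append(rect)
        self.dirty = True  # set whenever a new subscription is added

class SummaryModule:
    """Stands in for subscription summary module 1104."""
    def __init__(self, subscriptions: Subscriptions) -> None:
        self.subscriptions = subscriptions
        self.precise_summary: List[Rect] = []

    def on_timer(self) -> bool:
        """Re-summarize only if the hosted set changed; return True if it did."""
        if not self.subscriptions.dirty:
            return False
        self.precise_summary = list(self.subscriptions.filters)
        self.subscriptions.dirty = False  # cleared once the summary is current
        return True
```

A timer-driven caller would invoke on_timer periodically; the returned flag could then tell an update route module whether there is a new summary to propagate.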

In an EDN node that serves in the role of event router or subscription router, the subscription summary module 1104 maintains at least one summary for each matcher node to which the EDN node routes. In an exemplary embodiment of the invention, each summary is indexed by an R-tree. In an alternative embodiment, some of the summaries are maintained utilizing one or more Bloom filters.

The update route module 1108 propagates subscription filter setsummaries throughout the event distribution network. In an EDN nodeserving in the role of matcher, in accordance with an embodiment of theinvention, the update route module 1108 periodically determines whetherthe precise subscription summary has changed. If the precisesubscription summary has changed, the update route module 1108 sends theupdated precise subscription summary to each of the EDN nodes to whichit is directly connected. In an alternative embodiment, the update routemodule 1108 periodically determines whether the set of hostedsubscriptions has changed, triggers an update of the precisesubscription summary if required and then sends the updated precisesubscription summary to each of the EDN nodes to which it is directlyconnected. In an alternative embodiment, an updated precise subscriptionsummary is sent only to directly connected EDN nodes serving in the roleof event router.

In an EDN node serving in the role of event router and/or subscription router in an embodiment of the invention, the update route module 1108 receives updated precise subscription summaries from EDN nodes to which it is directly connected. In an embodiment of the invention, it is the responsibility of the matcher node whose set of hosted subscriptions changes to send an updated precise subscription summary to each EDN node in the event distribution network. In an alternative embodiment, the matcher node sends the update only to directly connected event router nodes and it is the responsibility of an event router node to propagate the updated precise subscription summary throughout the event distribution network. In an embodiment of the invention, received precise subscription summary updates are submitted to the subscription summary module 1104 where they are maintained.

In an embodiment of the invention, an event router node aggregates precise subscription summaries before propagating them to upstream event router nodes. For example, in the exemplary event distribution network illustrated in FIG. 4, the downstream EDN node 404 aggregates precise subscription summaries received from two EDN nodes 408, 410 before sending the aggregated precise subscription summary to upstream EDN node 402. In an embodiment of the invention, a subscription router node aggregates precise subscription summaries before propagating them to an upstream subscription router node.

The add new subscription module 1110 receives new subscriptions routedto an EDN node for hosting and adds them to the subscriptions module1102. In an embodiment of the invention, the add new subscription module1110 adds the new subscription to an R-tree maintained by thesubscriptions module 1102 utilizing a method that maintains goodsubscription locality.

The match event module 1106 receives events routed to an EDN node for matching. A match event module matches a received event against each of the subscriptions maintained by the subscriptions module 1102. In an embodiment of the invention, the match event module 1106 exploits the properties of an R-tree index maintained by the subscriptions module 1102 to perform the matching operation efficiently. For each subscription that matches an event, the match event module 1106 sends an event notification to the notification address associated with the subscription. In an embodiment of the invention, the notification address of each matching subscription is added to a notification list and the notifications are generated after the matching operation is complete. In an embodiment of the invention where a notification service is available, the notification list and a copy of the event are sent to the notification service after the matching operation is complete.

A route event module 1112 receives newly published events from a publisher (not shown) as well as events routed from other EDN nodes. A received event is matched against a subscription filter set summary for each directly connected EDN node (except an EDN node that was a source of the event) that serves in the role of event router or matcher. The subscription filter set summaries are those maintained by the subscription summary module 1104. The event is routed to each EDN node associated with a subscription filter set summary that matches the event. If the route event module 1112 resides on an EDN node that serves in the roles of event router and matcher, a received event is also matched against the subscription filter set summary for the EDN node. If the received event does match, it is submitted to the EDN node's match event module 1106. In an embodiment of the invention, the same matching engine is utilized by both the match event module 1106 and the route event module 1112.
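The routing decision made by the route event module 1112 can be sketched as follows. This is an illustrative simplification: each summary is assumed to be a plain list of rectangles scanned linearly (the specification indexes summaries with an R-tree), events are assumed to be two-dimensional points, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout
Point = Tuple[float, float]

def contains(rect: Rect, point: Point) -> bool:
    """True if the event point lies inside the (closed) rectangle."""
    return rect[0] <= point[0] <= rect[2] and rect[1] <= point[1] <= rect[3]

def route_event(event: Point,
                summaries: Dict[str, List[Rect]],
                source: Optional[str] = None) -> List[str]:
    """Return the directly connected nodes whose summary matches the event.

    The node the event arrived from, if any, is excluded so the event
    is not sent back to its source.
    """
    return [node for node, rects in summaries.items()
            if node != source and any(contains(r, event) for r in rects)]
```

An imprecise summary would simply contain fewer, larger rectangles: every node hosting a matching subscription is still selected, at the cost of some false positive destinations.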

A route subscription module 1114 receives new subscriptions from a subscriber (not shown). In an embodiment of the invention, a route subscription module 1114 also receives subscriptions routed from other EDN nodes. For each directly connected EDN node (except an EDN node that was a source of the subscription) that serves in the role of subscription router or matcher, the overlap between a received subscription and the subscription filter set summary for the EDN node is calculated. The subscription filter set summaries are those maintained by the subscription summary module 1104. The subscription filter set summaries are not necessarily those maintained for use by the route event module 1112; for example, subscription filter set summaries with a lower dimensionality are utilized by an embodiment of the invention to route subscriptions. Whereas subscription filter set summaries utilized to route events must guarantee correct event distribution network operation (e.g., that every matcher node that hosts one or more subscriptions matching the event does receive the event), subscription filter set summaries utilized to route subscriptions need only result in an online partitioning that improves event distribution network throughput. The subscription is routed to the EDN node with the greatest overlap. If an EDN node receiving the new subscription serves in the roles of subscription router and matcher, the EDN node is also a candidate for hosting the subscription. If the subscription filter set summary of the EDN node receiving the new subscription has the greatest overlap with the new subscription, the new subscription is submitted to that EDN node's add new subscription module 1110. In an embodiment of the invention, the same matching engine is utilized by both the match event module 1106 and the route subscription module 1114.

FIG. 12 depicts in more detail a part of a procedure utilized by the route subscription module 1114 to select the routing destination for a subscription in an embodiment of the invention. The route subscription module 1114 has a list of candidate EDN nodes. At step 1202, a candidate EDN node is selected from the list. At step 1204, the candidate EDN node is examined to determine whether it is a matcher node capable of hosting the new subscription rather than, for example, a dedicated subscription router. If the EDN node is not capable of hosting the new subscription, it is eliminated as a candidate and the procedure progresses to step 1214. If the candidate EDN node does serve in the role of matcher, the procedure progresses to step 1206 where it is further determined whether or not the matcher node is overloaded. In an embodiment of the invention, each matcher node periodically updates each subscription router node with the current load of the matcher node. If the matcher node is overloaded, it is eliminated as a candidate and the procedure moves to step 1214 to determine whether there are any other candidates. Otherwise the procedure progresses from step 1206 to step 1208.

At step 1208, the overlap between the filter rectangle associated with the new subscription and the candidate EDN node's subscription filter summary is calculated. In an exemplary embodiment of the invention, the overlap is calculated as the largest of the intersections of the filter rectangle associated with the new subscription and each of the filter rectangles comprising the subscription filter summary. In an alternate embodiment, the overlap is calculated as the cumulative intersection of the filter rectangle associated with the new subscription and each of the filter rectangles comprising the subscription filter summary. At step 1210, it is determined whether the overlap just calculated is the maximum overlap calculated so far in the procedure. For example, in an embodiment of the invention, the summary with the largest event space intersection with the new subscription is the summary that has the maximum overlap with the new subscription. Making maximum overlap a new subscription routing criterion in an embodiment of the invention results in the maintenance of good hosted subscription set locality and thus minimizes new event traffic to a matcher due to the new subscription. If this is the maximum overlap, the procedure commences executing step 1212. Otherwise the procedure passes from step 1210 to step 1214 to determine whether there are any more candidates. At step 1212, the destination of the new subscription is set to be the current candidate EDN node, and the procedure passes to step 1214. At step 1214, it is determined whether there are any other candidates. If there are, the procedure returns to step 1202, otherwise the procedure moves to step 1216. At step 1216, the subscription is routed to the destination last set in step 1212.
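The candidate loop of FIG. 12 can be sketched in Python as follows, with step numbers noted in comments. The Candidate class, the boolean overloaded field (standing in for a comparison of reported load against a threshold) and the flat list used as a summary are assumptions made for illustration. The cumulative flag switches between the two overlap calculations described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout

def intersection_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles, or 0.0 if they are disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

@dataclass
class Candidate:
    """Hypothetical view of an EDN node as seen by the route subscription module."""
    name: str
    is_matcher: bool = True
    overloaded: bool = False
    summary: List[Rect] = field(default_factory=list)

def select_destination(new_rect: Rect, candidates: List[Candidate],
                       cumulative: bool = False) -> Optional[Candidate]:
    destination, max_overlap = None, -1.0
    for cand in candidates:                       # steps 1202 and 1214
        if not cand.is_matcher:                   # step 1204
            continue
        if cand.overloaded:                       # step 1206
            continue
        areas = [intersection_area(new_rect, r) for r in cand.summary]
        overlap = sum(areas) if cumulative else max(areas, default=0.0)  # step 1208
        if overlap > max_overlap:                 # step 1210
            destination, max_overlap = cand, overlap  # step 1212
    return destination                            # routed at step 1216
```

Note that with the strict comparison at step 1210, the first candidate examined wins a tie; the specification does not state a tie-breaking rule, so this is a choice of the sketch.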

A tune routing precision module 1116 adjusts the precision of summaries maintained by subscription summary module 1104 to prevent an EDN node that serves in the role of event router or subscription router from becoming an event distribution network bottleneck. FIG. 13 depicts at a high level a procedure utilized by the tune routing precision module 1116 in an embodiment of the invention. At step 1302, the procedure waits for a period of time determined by a timeout value, initially set to a default value, for example, one minute. The waiting step is important because the tune routing precision module 1116 must not be a burden on the resources of the EDN node where it resides. When the timeout occurs, the procedure progresses to step 1304.

At step 1304, the load of the EDN node is determined (e.g., the average load over the last one minute) and compared to a high threshold (e.g., 96% of maximum load). If the load is above the threshold, the EDN node is overloaded and the procedure moves to step 1306. At step 1306, the precision of the subscription filter summaries utilized to route events and/or subscriptions is reduced in order to prevent the EDN node from becoming an event distribution network bottleneck. Otherwise the procedure moves from step 1304 to step 1308. In contrast to step 1306, at step 1308, the precision of the subscription filter summaries utilized to route events and/or subscriptions is increased in order to reduce false positive event traffic routed to matcher nodes.

Steps 1308 and 1306 provide for good tracking of optimal summary precision in response to changing event distribution network conditions, but when conditions remain constant, the procedure results in a summary precision setting that oscillates around the optimum. In order to dampen any oscillations, the procedure moves to step 1310. At step 1310, the procedure examines the recent sequence of summary precision setting adjustments. If the summary precision setting has been oscillating (e.g., up tick, down tick, three times in a row) without an increase in timeout value, the procedure moves to step 1312. If no oscillation is detected, then the optimum summary precision setting has not been found and the procedure moves to step 1314 where the timeout value is reset to the initial default timeout value. At step 1312, a check is made to ensure that scaling up the timeout value does not exceed a maximum (e.g., 24 hours) and if not, then the timeout value is scaled up by some factor at step 1316. A higher timeout scaling factor results in better summary precision setting stability at the expense of taking longer to detect a shift in the optimal summary precision. Once any adjustments to the timeout value have been made, the procedure returns to the wait step 1302.
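One way to read the procedure of FIG. 13 is sketched below. The sketch rests on several assumptions that are not in the specification: precision is modeled as an abstract integer level, "no oscillation" is interpreted as two consecutive adjustments in the same direction, the scaling factor defaults to 2, and the wait of step 1302 is left to the caller, who would sleep for the returned timeout before calling step again.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PrecisionTuner:
    """Hypothetical stand-in for tune routing precision module 1116."""
    precision: int = 5                 # abstract precision level (assumed)
    default_timeout: float = 60.0      # one minute
    max_timeout: float = 24 * 3600.0   # 24 hours
    scale: float = 2.0                 # assumed timeout scaling factor
    high_threshold: float = 0.96       # 96% of maximum load
    timeout: float = 60.0
    history: List[int] = field(default_factory=list)  # alternating ticks so far

    def step(self, load: float) -> float:
        """Run steps 1304-1316 once and return the next timeout for step 1302."""
        delta = -1 if load > self.high_threshold else 1   # step 1306 or 1308
        self.precision += delta
        if self.history and self.history[-1] == delta:
            # Same direction twice: a trend, not an oscillation (step 1314).
            self.history = [delta]
            self.timeout = self.default_timeout
        else:
            self.history.append(delta)
            if len(self.history) >= 6:  # three up/down pairs in a row (step 1310)
                if self.timeout * self.scale <= self.max_timeout:  # step 1312
                    self.timeout *= self.scale                     # step 1316
                self.history = []  # restart detection after a timeout increase
        return self.timeout
```

Under constant conditions the timeout doubles after every three oscillation cycles until the maximum is reached, so adjustments become rare; a sustained change in load produces two ticks in the same direction and restores the default timeout.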

In an embodiment of the invention, events, subscriptions andsubscription summary updates are implemented utilizing XML messages. Inaddition, EDN nodes are implemented in the context of an extended WebServices framework utilizing SOAP-based protocols. XML messages,SOAP-based protocols and a standard Web Services framework are known inthe art, so only some of their features are described here.

FIG. 14 illustrates the architectural components of a Web Servicesframework with event distribution network extensions (EDN-specificextensions shaded). In an embodiment of the invention, unless otherwisespecified, each component is capable of communicating with and invokingthe functionality of another component whether or not the componentsare, for example, adjacent in the figure. A Messaging Layer 1402provides the infrastructure for sending and receiving XML messagesbetween Web Services endpoints (e.g., EDN nodes). A Namespace BindingLayer 1404 maintains a hierarchical namespace (e.g., forpublish/subscribe topics or routing table entries) and associates eachname entry with a matcher class that would be instantiated to store thefilters contained in XML messages sent to that name. When a new route ortopic entry is created in the namespace, the XML creation message hasthe option of specifying a Uniform Resource Identifier (URI) thatidentifies the matching engine to utilize for handling any filteroperations associated with the topic/route. The default matcher class inthe framework is a standard X-Path Filter Matcher 1406, which is alsoutilized by the Messaging Layer 1402 for XML message dispatching.

A Base Route Manager 1408 registers a handler with the Messaging Layer1402 to receive each incoming XML message, and utilizes a namespacelayer instance and the matcher instances associated with its nameentries to make routing decisions. It also registers another handler toreceive route administration messages for creating, deleting, andenumerating route information. A Base Subscription Manager 1410registers a handler to receive each XML message related to topicmanagement, subscriptions, and event publications. In addition tosupporting pluggable matcher and namespace implementations, the eventdistribution network extensions also allow applications to extend thebase classes in order to add custom XML elements to base XMLadministrative messages and to override or include additional logic inroute or topic management.

In an embodiment of the invention, an event router node runs an EDNRoute Manager 1412 with namespace entries associated with an R-treeRoute Set provided by an EDN R-tree Matcher 1414 (instead of the defaultX-Path Filter Matcher 1406). Implementing aspects of modules 1104, 1112in FIG. 11A, an R-tree Route Set holds the set of summary R-trees, onefrom each matcher node, matches a newly published event against eachtree, and returns a list of matcher nodes associated with the summariesthat match the event. Subscription routing is implemented by calling aRoute Closest function (instead of a Route Any function) on the sameR-tree Route Set. An event router node starts an independent thread ofexecution to implement the summary precision adjustment function of FIG.11A's tune routing precision module 1116. Each subscription andsubscription filter set summary has a unique identifier associated withit which is returned to the originator. The unique identifiers areuseful for efficient update and delete by the originator.

In an embodiment of the invention, a matcher node runs the EDN Route Manager 1412 to implement R-tree subscription routing, i.e., aspects of modules 1114, 1108 in FIG. 11A. The matcher node also runs an EDN Subscription Manager 1416 with namespace entries associated with an R-tree Matching Engine. The R-tree Matching Engine utilizes an R-tree matcher to implement a single-node filtering engine (i.e., aspects of modules 1102, 1106, 1110 in FIG. 11A) and another R-tree matcher to implement a summary manager that maintains a precise summary R-tree (i.e., aspects of module 1104 in FIG. 11A). In an embodiment of the invention, when the matcher node's routing R-tree is changed due to the insertion of new subscriptions, the updated R-tree is sent to the event router node's EDN Route Manager 1412 as an XML route update message, implementing aspects of modules 1102, 1104, 1108, 1110 in FIG. 11A.

All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.

The use of the terms “a” and “an” and “the” and similar referents in the context of describing the invention (especially in the context of the following claims) is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

Preferred embodiments of this invention are described herein, includingthe best mode known to the inventors for carrying out the invention.Variations of those preferred embodiments may become apparent to thoseof ordinary skill in the art upon reading the foregoing description. Theinventors expect skilled artisans to employ such variations asappropriate, and the inventors intend for the invention to be practicedotherwise than as specifically described herein. Accordingly, thisinvention includes all modifications and equivalents of the subjectmatter recited in the claims appended hereto as permitted by applicablelaw. Moreover, any combination of the above-described elements in allpossible variations thereof is encompassed by the invention unlessotherwise indicated herein or otherwise clearly contradicted by context.

1-18. (canceled)
 19. A computer-implemented method, comprising: dividingsubscriptions hosted by an event distribution network into a pluralityof subsets, one subset of subscriptions for each matcher node in theevent distribution network, such that each subset of subscriptions hasgood event space locality and covers a corresponding area of eventspace; assigning each subset of subscriptions to a matcher node;maintaining, at an event router node, a summary of the set ofsubscriptions assigned to a matcher node; and routing an event from anevent router node to a matcher node if the event falls within thesummary of the set of subscriptions assigned to the matcher node. 20.The method according to claim 19, wherein dividing subscriptions hostedby an event distribution network into a plurality of subsets, one foreach matcher node in the event distribution network, such that eachsubset of subscriptions has good event space locality and covers acorresponding area of event space comprises: setting the number of childnodes of the root node of a partitioning R-tree to be equal to thenumber of matcher nodes in an event distribution network; loading thepartitioning R-tree top-down with the subscriptions hosted by the eventdistribution network such that, for each partitioning of thesubscription filter set associated with the subscriptions, thesubscription filter set is partitioned to minimize the minimum boundingrectangle of the subscription filter subsets on each side of apartition; and choosing a subset of subscriptions to be thesubscriptions indexed by the sub-tree rooted at a child node of the rootnode of the loaded partitioning R-tree.
 21. The method according to claim 19, further comprising: maintaining, at a subscription router node, for each of the plurality of matcher nodes, an imprecise summary of the subscriptions assigned to a matcher node; receiving, at a subscription router node, a new subscription to be hosted by the event distribution network; determining, at a subscription router node, which of the imprecise summaries best covers the new subscription; and assigning the new subscription to the matcher node with the best imprecise summary.
 22. The method according to claim 21, wherein each imprecise summary comprises a plurality of data rectangles indexed by an R-tree, and wherein determining which of the imprecise summaries best covers the new subscription comprises: calculating the overlap between the subscription filter rectangle associated with the new subscription and each data rectangle indexed by a leaf-node of each imprecise summary R-tree; and choosing the imprecise summary with the data rectangle that has the maximum overlap with the new subscription.
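By way of illustration only, the selection step recited in claim 22 can be sketched as follows. The sketch assumes that each imprecise summary is available as a flat list of its leaf-level data rectangles (the R-tree index itself is omitted), that rectangles are axis-aligned, and that overlap is measured as intersection volume. The names `Rect`, `overlap`, and `best_matcher` are hypothetical and do not appear in the description; ties are resolved in favor of the first summary examined, which is likewise an assumption.

```python
from typing import Dict, List, Optional, Tuple

# A rectangle is one (low, high) interval per event-space dimension.
Rect = Tuple[Tuple[float, float], ...]


def overlap(a: Rect, b: Rect) -> float:
    """Volume of the intersection of two axis-aligned rectangles (0.0 if disjoint)."""
    volume = 1.0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b):
        side = min(a_hi, b_hi) - max(a_lo, b_lo)
        if side <= 0:
            return 0.0  # no intersection along this dimension
        volume *= side
    return volume


def best_matcher(summaries: Dict[str, List[Rect]], new_filter: Rect) -> str:
    """Return the matcher whose summary holds the data rectangle with maximum overlap.

    `summaries` maps a matcher identifier to the leaf-level data rectangles of
    that matcher's imprecise summary (assumed here to be a plain list).
    """
    best_id: Optional[str] = None
    best_overlap = -1.0
    for matcher_id, rects in summaries.items():
        for rect in rects:
            current = overlap(rect, new_filter)
            if current > best_overlap:  # strict comparison: first maximum wins
                best_id, best_overlap = matcher_id, current
    if best_id is None:
        raise ValueError("no summary contains any data rectangle")
    return best_id
```

For example, given a summary for a hypothetical matcher "m1" holding the rectangle ((0, 10), (0, 10)) and a summary for "m2" holding ((20, 30), (0, 10)) and ((8, 16), (0, 10)), a new subscription with filter rectangle ((9, 14), (2, 6)) overlaps the "m1" rectangle by 4 and the second "m2" rectangle by 20, so the subscription would be assigned to "m2".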
 23. A computer-readable medium having thereon computer executable instructions for performing a method comprising: dividing subscriptions hosted by an event distribution network into a plurality of subsets, one for each matcher node in the event distribution network, such that each subset of subscriptions has good event space locality and covers a corresponding area of event space; assigning each subset of subscriptions to a matcher node; maintaining, at an event router node, a summary of the set of subscriptions assigned to a matcher node; and routing an event from an event router node to a matcher node if the event falls within the summary of the set of subscriptions assigned to the matcher node.
 24. A computer-implemented method, comprising: partitioning an event space such that the number of event space partitions exceeds the number of matcher nodes in an event distribution network; assigning responsibility for a set of event space partitions to each matcher node; assigning each subscription to be hosted to the at least one matcher node responsible for an event space partition that the subscription falls within; maintaining, at an event router node, a summary of the set of subscriptions assigned to a matcher node; and routing an event from an event router node to a matcher node if the event falls within the summary of the set of subscriptions assigned to the matcher node.
 25. The method according to claim 24, wherein the event space partitions are each orthogonal to an equality filtered dimension of the event space, wherein the number of event space partitions is a multiple of the number of matcher nodes in an event distribution network and wherein assigning a set of event space partitions to each matcher node comprises: assigning each event space partition a weight that is the product of the number of subscriptions hosted by the event distribution network that subscribe to the event space partition and the number of unique equality predicates on the equality filtered dimension in the subscriptions hosted by the event distribution network that subscribe to the event space partition; sorting the event space partitions in order of decreasing weight; assigning each event space partition in sorted order to a matcher node that has a lowest cumulative event space partition weight at the time of assignment.
 26. The method according to claim 25, further comprising: re-calculating, after initial event space partition assignment, each event space partition weight; repeatedly re-assigning a lowest weight event space partition currently assigned to a most heavily loaded matcher node to a matcher node that has a lowest cumulative event space partition weight at the time of re-assignment, until the heavily loaded matcher node is no longer most heavily loaded.
 27. The method according to claim 24, further comprising re-assigning an event space partition from a heavily loaded matcher node to a less heavily loaded matcher node.
 28. A computer-readable medium having thereon computer executable instructions for performing a method comprising: partitioning an event space such that the number of event space partitions is a multiple of the number of matcher nodes in an event distribution network; assigning responsibility for a set of event space partitions to each matcher node; assigning each subscription to be hosted to the at least one matcher node responsible for an event space partition that the subscription falls within; maintaining, at an event router node, a summary of the set of subscriptions assigned to a matcher node; and routing an event from an event router node to a matcher node if the event falls within the summary of the set of subscriptions assigned to the matcher node.
 29-37. (canceled)
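By way of illustration only, the weighted partition assignment of claim 25 and the re-assignment of claim 26 can be sketched as follows. The sketch assumes that the per-partition subscription counts and unique equality predicate counts have already been gathered, and that partitions and matcher nodes are identified by strings. The names `partition_weight`, `assign_partitions`, and `rebalance` are hypothetical. The early exits in `rebalance` (when the heavily loaded matcher node has no partitions left, or is itself the least loaded) are safeguards added for this sketch and are not recited in the claims.

```python
from typing import Dict, List


def partition_weight(num_subscriptions: int, num_unique_equality_predicates: int) -> int:
    """Weight of one event space partition: the product of the two counts."""
    return num_subscriptions * num_unique_equality_predicates


def assign_partitions(weights: Dict[str, int], matchers: List[str]) -> Dict[str, List[str]]:
    """Assign partitions in order of decreasing weight to the least-loaded matcher."""
    assignment: Dict[str, List[str]] = {m: [] for m in matchers}
    load: Dict[str, int] = {m: 0 for m in matchers}
    for part in sorted(weights, key=lambda p: weights[p], reverse=True):
        # Matcher with the lowest cumulative weight at the time of assignment.
        target = min(matchers, key=lambda m: load[m])
        assignment[target].append(part)
        load[target] += weights[part]
    return assignment


def rebalance(assignment: Dict[str, List[str]], weights: Dict[str, int]) -> Dict[str, List[str]]:
    """Move lowest-weight partitions off the most heavily loaded matcher.

    `weights` holds the re-calculated partition weights. The assignment is
    modified in place and also returned.
    """
    load = {m: sum(weights[p] for p in parts) for m, parts in assignment.items()}
    heavy = max(load, key=lambda m: load[m])
    # Continue until the originally heaviest matcher is no longer the heaviest.
    while assignment[heavy] and heavy == max(load, key=lambda m: load[m]):
        target = min(load, key=lambda m: load[m])
        if target == heavy:
            break  # safeguard: nothing lighter to move to
        part = min(assignment[heavy], key=lambda p: weights[p])
        assignment[heavy].remove(part)
        assignment[target].append(part)
        load[heavy] -= weights[part]
        load[target] += weights[part]
    return assignment
```

Because the number of event space partitions exceeds the number of matcher nodes, load can be shifted in this way by moving whole partitions between matcher nodes, without repartitioning the event space.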